Abstract

We present a simple and generic way to reason about name 
binding. Name binding is an essential component of every 
nontrivial programming language, matching uses of names, 
references, with the things that they name, declarations, based 
on scoping rules defined by the language. The definition of 
name binding is often entangled with the language-specific 
details, which makes abstract and comparative analysis of 
competing designs challenging. We present a framework that
abstracts the fundamental notions of references, declarations,
and scopes, and expresses scoping rules in terms
of four scope combinators and three properties of a specific
programming language encapsulated in a concept named
Language. Using this framework, we clarify complex scoping
rules like argument-dependent lookup in C++, investigate
the implications of the concepts feature for C++, and introduce 
a novel scoping rule named weak hiding. In an ideal world, 
specifications could be formulated based on our framework, 
and compilers could use such a formulation to unambiguously
implement name binding. While our examples are primarily 
centered around C++ and lexical scoping, our framework has 
applications in other languages and dynamic scoping. 

Categories and Subject Descriptors D.3.3 [Programming 
Languages]: Language Constructs and Features — Frameworks 

General Terms Languages, Framework, Binding, Scope, 
Lookup, Resolution 

Keywords Concepts, C++, Name Binding, Name Lookup, 
Name Resolution, Scoping Rules, Combinators 

1. Introduction 

In programming languages, name binding refers to the process
of matching uses of names, references, to the things that
they name, declarations, based on a set of scoping rules [1, 11, 12,
32, 34, 41, 42]. This process is essential to every nontrivial
programming language, but how a language defines it can be 
cumbersome to understand and analyze. Programming languages
typically define it as part of their (standard)
specifications, but such documents can be rather lengthy and
complex to sort through, especially when it comes to extracting
the language-specific definition of name binding.
Usually, the definition of name binding is either implicit or
tightly intertwined with the definitions of other features.
From one language to another, the definition of name binding
may even use different terminology, making it difficult to
understand exact differences. Similarly, it is difficult to assess
how proposed features such as concepts [4, 19, 47], argument-dependent
lookup (in C++ [5]) or type-directed name resolution (in
Haskell [30]) affect name binding.

To simplify such challenges, we present a framework for 
understanding and specifying name binding independently 
of language-specific details. Our framework abstracts the 
fundamental notions of references, declarations and scopes, de- 
coupling the expression of scoping rules from their execu- 
tion. For different languages, features, or even kinds of ref- 
erence, we express scoping rules using a minimal set of four 
scope combinators that we have defined, plus three properties
of name binding that we have identified. We encapsulate
these properties in a concept named Language, instances
of which express salient differences in scoping rules
between different languages.

We developed this framework in an effort to understand 
how a proposed design for concepts would affect C++ and 
to propose an extension to name binding that we found necessary
for concepts. This extension turned out to be a new
scoping rule, called weak hiding, with applications in other
languages that support any form of constraint-based polymorphism
akin to concepts, like Haskell. The framework
allowed us not only to describe our extension, but also to explore
how it could be integrated into current compilers. In particular,
we ended up with a novel mechanism for name binding
called two-stage name binding (Bind×2) that essentially consists
of iteratively applying current mechanisms for name
binding, under different contexts.

The rest of this chapter is structured as follows. In the next
section, we revisit our problem statement, describing the issues
that this framework resolves in greater detail. Then, in
Sect. 3 we introduce our framework, with its generic definition
of name binding. In Sect. 4 we cover applications of the
framework, which range from expressing complex features
like argument-dependent lookup (Sect. 4.1) and uses of operators
in C++ (Sect. 4.2), to expressing concrete differences
between various languages, features, or even kinds of references
(Sect. 4.3), to exploring compiler integrations and
concretely motivating our proposed weak hiding and Bind×2
(Sect. 4.4). A discussion on background, ongoing and related
work follows in Sect. 5 and we conclude in Sect. 6. An implementation
of our framework is available online [6].

2. Problem Statement

Consider a call to some function named move, cf. Fig. 1 for
example, and attempt to explain how the call is type-checked
in three different languages: C++ [5, 51], Haskell [30], and
Common Lisp [45]. Assuming a basic understanding of programming
language theory, the explanation quickly follows
drastically different paths, including different choices
of terminology. First, the C++ path requires understanding
features like templates, concepts, (non-)dependent
names, overload resolution, namespaces, using declarations,
and at some level, different compilers and their
interpretations of the specification, e.g., GCC and Clang.
Second, the Haskell path requires understanding
the Hindley-Milner type system, type classes, type inference,
dependency analysis, dictionary-passing style, modules,
import declarations, and at some level, also different
compilers, e.g., GHC, Hugs and NHC. Third, the
Common Lisp path requires, among other things, understanding
dynamic scoping as enabled via the defvar and declare
constructs.

Programming languages typically define the features that 
they support in a specification document, which includes a 
definition of name binding as well as the scoping rules to 
apply for various kinds of references. For example, a spec- 
ification document may dictate how to bind uses of types, 
functions, or record field names; and may do so differently 
for each case. A recurring issue with specifications, however,
is that they tend to be lengthy and too complex to understand.
For example, in C++, the definition of name binding
spans about four chapters of the specification document
and contains links to various other parts of the document.
In Haskell, while the specification document may not
be as lengthy, the sections that cover name binding also contain
links to other parts of the document. We can only anticipate
similar observations in other languages, depending
on their practical use.

The difficulty of extracting the meaning of name binding
from language specifications increases the challenge of
understanding and expressing the binding of various names.
In fact, we have noticed that production-grade C++ compilers
like Clang [8], GCC [13], the Intel compiler 02, and
Microsoft Visual C++ [31] implement uses of operators and
argument-dependent lookup (ADL) either incorrectly or inconsistently
(Sects. 4.1 and 4.2).

Adding new features to a language makes understanding 
name binding even more challenging, since the addition may 
affect the meaning of name binding in unanticipated ways. 
For example, ADL in C++ is intended to facilitate uses of op- 
erators defined in different namespaces. However, after sev- 
eral revisions of the standard, it has become a classic exam- 
ple of a feature with very complex implications. As of today, 
for nearly every newly proposed feature, compatibility with
ADL has become a standard concern [2, 3, 16, 49]. Moreover,
even uses of operators have gained complexity with ADL; 
and that is in addition to the fact that they follow slightly 
different scoping rules than function calls. 

2.1 Towards the Name Binding Framework 

We started looking at specifications for name binding in an
effort to help design concepts for C++ using ConceptClang
[53, 54]. Essentially, we were investigating how a function
call like move() should be type-checked in a constrained
template. Fig. 1b illustrates the exact problem that we were
facing. It turns out that when C++ migrates into ConceptC++
(i.e. C++ with concepts), valid programs such as in Fig. 1a
would no longer be considered valid. This is because the
requires clause in Line 3 of Fig. 1b restricts the binding of
Lines 8-9 in ways that invalidate the call in Line 9.
Namely, without constraints (Fig. 1a, Line 9), concepts are
implicit; and with such, all function calls in C++ templates
look up declarations in external scopes and are bound via
overload resolution, successfully in this example. In contrast,
with explicit concepts (Fig. 1b), declarations in outer
scopes are hidden by constraints (Line 3), and the function
call in Line 9 fails (see footnote 1).

Maintaining the backward compatibility of programs as
new features are added to a language is very important in
software production, and of particular importance to C++.
So, Line 9 of Fig. 1b illustrates a problem that warranted
a solution. We actually came up with a solution, but explaining
the problem and describing our solution in concrete
terms turned out to be challenging and easily prone to bikeshedding
into other language details. Thus, we developed
the framework that is the essence of this chapter (and introduced
in Sect. 3). Using our framework, we are now able to
formulate our ideas as follows.

$$\mathit{rotate} \lhd P \lhd T \lhd (N \Leftrightarrow U), \tag{1}$$

$$\mathit{rotate} \lhd P \lhd (C_T \Leftrightarrow T) \lhd (N \Leftrightarrow U), \text{ and} \tag{2}$$

$$\mathit{rotate} \lhd P \lhd (C_T \Leftrightarrow T) \unlhd (N \Leftrightarrow U), \tag{3}$$

Eqs. 1 and 2 describe the issues at hand, i.e., with Line 9 of
Figs. 1a and 1b, respectively; while Eq. 3 describes our proposed
solution. Here, rotate represents the block scope of the function
template rotate; $C_T$ holds all declarations corresponding
to the constraint C<T>; P, T, and N, respectively, represent
the scopes of function parameters, template parameters,
and surrounding namespaces (possibly more than one,
assuming that the example represents only a fragment of a
complete program); and U represents the scopes imported
by declarations in the surrounding namespace.

The problem that we had (Fig. 1b), where uses of outer-scope
declarations are invalidated by explicit constraints,
was essentially a scoping rules problem; and to resolve it,
we have identified four elementary scoping rules that we
named hiding, merging, opening and weak hiding, and respectively
denoted with symbols $\lhd$, $\Leftrightarrow$, $\rhd$ and $\unlhd$. The first
two rules, hiding ($\lhd$) and merging ($\Leftrightarrow$), correspond to the
most common forms of scoping rules. Given two scopes A
and B, $A \lhd B$ corresponds to searching for declarations in B
only if no match is found in A, and $A \Leftrightarrow B$ corresponds to
searching for declarations in both A and B as if the scopes
were merged into one. We introduce opening ($\rhd$) as necessary
for expressing ADL in C++, and propose weak hiding ($\unlhd$)
as a new scoping rule for checking name uses in constrained
environments, or restricted scopes. One may think of opening
as a dual of hiding, which has $A \rhd B$ correspond to searching
for declarations in B when a match is found in A; and of
weak hiding as a middle ground between hiding and merging,
which has $A \unlhd B$ correspond to searching for declarations in
B only if no viable match is found in A.

To express the subtle distinctions between all four scop- 
ing rules, we had to take a closer look at the process of name 



1 Our analysis is based on a proposal for C++ concepts dubbed pre-Frankfurt
[5, 20]. In a recent proposal dubbed Palo Alto [47], the body
of constrained templates is checked using a procedure called expres- 
sion validation, which has not yet been specified. When specified, 
such a procedure can only lead to a program behavior isomorphic to 
that of the pre-Frankfurt design. Thus, we consider the descriptions 
herein provided as implicitly applying to the Palo Alto alternative. 
 1 template<typename I>
 2 // implicitly requires RandomAccessIterator<I>
 3 // and Permutable<I>, with move(ValueType<I>&&)
 4 I rotate(I f, I m, I l) { ...
 5   I p = f;
 6   ...
 7   if (is_pod(ValueType<I>) ...) {
 8     ValueType<I> t = move(*p);
 9     move(p + 1, p + l - f, p); // OK.
10     ...
11   }
12   ...
13 }

(a) With implicit concepts in C++

 1 template<typename I>
 2   requires RandomAccessIterator<I> &&
 3     Permutable<I> // => move(ValueType<I>&&)
 4 I rotate(I f, I m, I l) { ...
 5   I p = f;
 6   ...
 7   if (is_pod(ValueType<I>) ...) {
 8     ValueType<I> t = move(*p);
 9     move(p + 1, p + l - f, p); // Not OK?
10     ...
11   }
12   ...
13 }

(b) With explicit concepts in ConceptC++



Figure 1: Name lookup and resolution with and without concepts.
This program is adapted from the latest release (4.6) of the GNU C++ Library [15]. It is a fragment of a specialization of the
rotate() algorithm, from the standard template library [4, 33, 46], for random access iterators [14]. We have removed the
std:: qualifiers for simplicity. One-argument calls to move, such as in Line 8, are supposed to be covered by the constraint in
Line 3, while three-argument calls to move are bound to an outer scope declaration that is properly constrained.



binding itself as it executes each one of the scoping rules; 
our goal being to break it down into appropriate stages. We 
ended up with a view of name binding as a composition of 
name lookup and name resolution, where name lookup gath- 
ers all declarations matching a given reference, while name 
resolution reduces the matches down to the best viable dec- 
laration. With this compositional view, we note that usually, 
only name lookup determines which declarations to hide or 
merge. But, it is only with weak hiding that name resolu- 
tion also determines which declarations to hide (because the 
check whether a matching declaration is actually viable for 
a reference happens during name resolution). Without weak 
hiding, a non-viable declaration may hide a viable declara- 
tion; while with weak hiding, only viable declarations get to 
hide other declarations. 

Indeed, our proposed solution (Eq. 3) uses weak hiding;
and with that, the program in Fig. 1b is no longer invalid
because name binding does not halt when it finds the constraints
and fails, but rather moves on to looking up names
in outer scopes, where it finds a viable match for the call in
Line 9.

2.2 Two Layers of Abstraction 

Our framework defines the denotations $\lhd$, $\Leftrightarrow$, $\rhd$ and $\unlhd$ as
combinators acting on scopes, or scope combinators, based on
a compositional view of name binding as name lookup followed
by name resolution. This said, Eqs. 1-3 are scope expressions
that can be executed by compilers. How the scope
expressions behave depends on properties of the language
that can be expressed in either one of two ways. On the one
hand, one may

• express the scoping rules using the scope combinators.
On the other hand, one may

• express other properties not captured by the combinators
in terms of three salient properties, which are:

■ how a language determines the potential matches,

■ how it checks that a declaration is viable, and

■ how it handles ambiguous matches (which may not
always result in a failure).

The first two properties specify how names are hidden, while 
the second one indicates support for overloading (a.k.a. 
ad-hoc polymorphism that is not based on type-class con- 
straints), viability checking, or related variants. Based on 
these properties, we can express other points of differences 
in name binding beyond how the scopes are combined. In
particular, we can clarify aspects of proposed features like
multimethods [35] for C++ or type-directed name resolution
(TDNR) for Haskell, providing
a different way to explain how TDNR is not overloading.
We can also explore possible effects of proposed features like
weak hiding on languages like Haskell. Sect. 4.3 covers these
examples.

For generality purposes, we have encapsulated the above 
salient properties in a concept named Language. To use our
framework, one must instantiate this concept for different 
references, declarations, features or languages. 

Naturally, the Language concept, as a generic tool, brings
in a range of possibilities for additional abstractions and experimentation.
In particular, it has allowed us to investigate
different ways in which one could integrate our combinators
in current compilers [6]. A notable outcome of this investigation
is a novel mechanism for name binding, implementing
weak hiding, that we call two-stage name binding (Bind×2).
Bind×2 is designed to facilitate the integration of weak hiding
in current compilers and has been implemented for C++
concepts [53]. We briefly introduce it in Sect. 4.4; a more
thorough definition and description of Bind×2 falls outside
the scope of this paper.

3. The Name Binding Framework 

We begin with a general definition of name binding. As
stated in Sect. 1, it is a process that matches a reference to a
declaration, if possible. As such, it takes in a representation of
a reference, Ref, including any contextual information about
the reference, and returns either a declaration, Decl, or an
error, Error. One essential piece of contextual information about
the reference is a scope, Scope, which we single out from all the
other information for genericity purposes. That said, we define
name binding as a function

$$\mathit{bind} : \mathit{Ref} \times \mathit{Scope}\ \mathit{Ref}\ \mathit{Decl} \to (\mathit{Decl} + \mathit{Error})$$

such that

$$\mathit{bind}(\mathit{ref}, \mathit{scope}) = ((\mathit{resolve}\ \mathit{ref}) \circ (\mathit{lookup}\ \mathit{ref}))\ \mathit{scope};$$

where lookup is defined as a function from a reference and a 
scope to an overload set; and resolve is defined as a function 
from the overload set to, potentially, a declaration. We rep- 
resent scopes as mappings from references to sets of match- 
ing declarations. Thus, given a reference ref, lookup queries 
a scope for its matching declarations, a.k.a. an overload set. 
In other words, using Haskell as specification language, we 
could define a scope (Scope) as a data structure 
import Prelude hiding (lookup, null)
import Data.Set (Set)
data Scope ref decl =
  Scope { lookup :: ref -> OverloadSet ref decl }

in which lookup (lookup) is considered a property of scopes, 
and an overload set (OverloadSet) is defined as a data 
structure 



data OverloadSet ref decl = OverloadSet {
    result  :: Set decl
  , null    :: Bool
  , resolve :: ref -> Maybe decl
  , merge   :: OverloadSet ref decl ->
               OverloadSet ref decl
  }



where result, null, resolve, and merge are properties 
of overload sets, respectively representing 

• the result of name lookup, 

• whether name lookup found any matching declaration, 

• how to resolve the lookup results down to a single decla- 
ration, and 

• how to merge a given overload set with another. 

In this definition, Scopes are considered entities that can
be combined into other Scopes based on some scoping rules
defined by the language. In the next sections (Sects. 3.1
and 3.2), we define four (4) scope combinators: three (3)
based on scoping rules as we currently know them, i.e., hiding
($\lhd$), merging ($\Leftrightarrow$) and opening ($\rhd$), and one based on
our novel scoping rule, weak hiding ($\unlhd$).



3.1 The Scope Combinators, Currently 

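Consider the function calls in Lines 4, 13 and 25 of the C++
program in Fig. 2. Respectively, they illustrate the hiding,
merging and opening scoping rules, which we define as scope
combinators

$$\begin{aligned}
\lhd,\ \Leftrightarrow,\ \rhd &: \mathit{Scope}\ \mathit{Ref}\ \mathit{Decl} \times \mathit{Scope}\ \mathit{Ref}\ \mathit{Decl} \to \mathit{Scope}\ \mathit{Ref}\ \mathit{Decl} \\
S_1 \Leftrightarrow S_2 &= \mathit{lookup}_{\mathit{ref}}\ S_1 \cup \mathit{lookup}_{\mathit{ref}}\ S_2 \\
S_1 \lhd S_2 &= \mathit{lookup}_{\mathit{ref}}\ S_1 \mathrel{?} \mathit{lookup}_{\mathit{ref}}\ S_1 : \mathit{lookup}_{\mathit{ref}}\ S_2 \\
S_1 \rhd S_2 &= \mathit{lookup}_{\mathit{ref}}\ S_1 \mathrel{?} \mathit{lookup}_{\mathit{ref}}\ S_2 : \mathit{empty}
\end{aligned}$$

using C++'s conditional operator, ?:, for simplicity.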

By the general definition of combinators, each scope combinator,
i.e. $\Leftrightarrow$, $\lhd$, and $\rhd$, takes two scopes and returns a
new scope. Given a reference ref, merging ($\Leftrightarrow$) joins the results
of name lookup in both scopes, while hiding ($\lhd$) and
opening ($\rhd$) decide whether to perform name lookup in the
second scope, $S_2$, based on whether name lookup in the first
scope, $S_1$, is successful. If lookup is successful in $S_1$, then hiding
considers only the first scope, ignoring the second scope.
In contrast, opening considers only the second scope, ignoring
the first one. If lookup is not successful in $S_1$, then hiding
considers the second scope; while opening considers the first
one, which happens to be empty. In the above definition, we
write empty solely for clarification purposes.
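
The three combinators can be sketched in C++ as higher-order functions over scopes. This is only an illustration under simplifying assumptions, not the implementation referenced in this paper: the types Decl, Ref, OverloadSet and Scope and the helper make_scope are hypothetical names, a declaration is modeled as a name with a parameter count, and lookup matches by name alone.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical model: a declaration is a name plus a parameter count.
struct Decl { std::string name; int arity; };
// A reference is a use of a name with some number of arguments.
struct Ref { std::string name; int args; };

using OverloadSet = std::vector<Decl>;
// A scope maps a reference to the declarations that match it by name.
using Scope = std::function<OverloadSet(const Ref&)>;

// Build a scope from a flat list of declarations (illustrative helper).
Scope make_scope(std::vector<Decl> decls) {
  return [decls](const Ref& r) -> OverloadSet {
    OverloadSet found;
    for (const Decl& d : decls)
      if (d.name == r.name) found.push_back(d);
    return found;
  };
}

// Merging: join the lookup results of both scopes.
Scope merging(Scope s1, Scope s2) {
  return [s1, s2](const Ref& r) -> OverloadSet {
    OverloadSet found = s1(r);
    OverloadSet more = s2(r);
    found.insert(found.end(), more.begin(), more.end());
    return found;
  };
}

// Hiding: consult s2 only if lookup in s1 finds nothing.
Scope hiding(Scope s1, Scope s2) {
  return [s1, s2](const Ref& r) -> OverloadSet {
    OverloadSet found = s1(r);
    return found.empty() ? s2(r) : found;
  };
}

// Opening: consult s2 only if lookup in s1 finds something.
Scope opening(Scope s1, Scope s2) {
  return [s1, s2](const Ref& r) -> OverloadSet {
    return s1(r).empty() ? OverloadSet() : s2(r);
  };
}
```

With an outer scope declaring foo with no parameters and a namespace scope declaring foo with one parameter, hiding(merging(block, ns), outer) returns only the one-parameter declaration for a call foo(), which is the situation of the second call in Fig. 2.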

Using these combinators, one may express the scoping 
rules for each call, respectively, as 

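$$\mathit{test} \lhd {::}, \tag{4}$$

$$(\mathit{test} \Leftrightarrow \mathit{ns}) \lhd {::}, \text{ and} \tag{5}$$

$$(\mathit{test} \Leftrightarrow \mathit{ns} \Leftrightarrow (\mathit{ns} \rhd (\mathit{test} \lhd \mathit{adl}))) \lhd ({::} \Leftrightarrow \mathit{adl}), \tag{6}$$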

where test represents the block scope of the functions 
test_hiding, test_merging and test_opening, ns 
and adl represent separate namespace scopes, and :: rep- 
resents the outer (namespace) scope. 



 1 void foo();
 2
 3 void test_hiding() {
 4   foo();
 5 }
 6
 7 namespace ns {
 8   void foo(int);
 9 }
10
11 void test_merging() {
12   using ns::foo;
13   foo();
14 }
15
16 namespace adl {
17   struct X { };
18   void foo(X);
19 }
20
21 void test_opening() {
22   adl::X x;
23   using ns::foo;
24   // void foo();
25   foo(x);
26 }

Figure 2: Using hiding, merging and opening in C++

The hiding and merging rules are the most common
in programming languages. While hiding corresponds to
name shadowing, merging corresponds to the union of different
scopes which we typically observe with features like mixins,
module imports, or C++'s using declarations. For example,
for the first call (Line 4, Eq. 4), lookup does not find
anything in test and thus proceeds to the outer scope where
it finds the declaration in Line 1. Later, resolve finds the found
declaration (Line 1) viable for the call, and the call succeeds.
In contrast, for the second call (Line 13, Eq. 5), lookup finds
the declaration in Line 8, which resolve does not find viable;
and the call fails. lookup does not consider the outer scope
because the found declaration (Line 8) has been merged into
test via the using declaration in Line 12 and thus hides the
declaration in the outer scope.
The opening rule is, interestingly, not trivially observed.
In fact, we found it necessary to define only so that we can
describe ADL in C++ (Sect. 4.1). One may view it as a dual
of hiding since its behavior, under the same condition, is
opposite to that of hiding. For example, consider the third
call (Line 25, Eq. 6) in Fig. 2. ADL dictates that, if a call
argument (e.g., x) is such that its type (e.g., X) is defined in
a given namespace (e.g., adl), then that namespace should
also be taken into consideration by name lookup; but that
is only if there is a matching declaration and no matching
block scope declaration. As a result, lookup finds and merges
all declarations from scopes test, ns and adl; and since the
declaration in Line 18 is the only one viable for the call, the
call succeeds; but that is only because Line 24 is commented
out. Uncommenting Line 24 causes the declaration in test to
hide declarations in adl, as well as those in the outer scope.
In other words, uncommenting Line 24 disables ADL if there
is a match in ns, which causes the call to fail.

To simplify expressions that would use these combinators,
i.e. scope expressions, as well as to reflect general
practices, we make the combinators right-associative, and
give merging a higher precedence than the other combinators.
So, an expression $S_1 \lhd S_2 \Leftrightarrow S_3 \lhd S_4$ is equivalent to

$$S_1 \lhd ((S_2 \Leftrightarrow S_3) \lhd S_4).$$

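Note that both hiding and opening are actually special cases
of a more general (ternary) combinator that one could define
as follows (with Scope RD = Scope Ref Decl):

$$\begin{aligned}
\langle ? \rangle &: \mathit{Scope}\ \mathit{RD} \times \mathit{Scope}\ \mathit{RD} \times \mathit{Scope}\ \mathit{RD} \to \mathit{Scope}\ \mathit{RD} \\
S_0 \mathbin{\langle ? \rangle} S_1\ S_2 &= \mathit{lookup}_{\mathit{ref}}\ S_0 \mathrel{?} \mathit{lookup}_{\mathit{ref}}\ S_1 : \mathit{lookup}_{\mathit{ref}}\ S_2.
\end{aligned}$$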

For simplicity and practical reasons, we do not focus on this 
generalized definition in this paper. 
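
For illustration only, the ternary combinator and the two special cases can be sketched in C++ as follows. The names ternary, empty_scope and make_scope, and the modeling of declarations as a name with a parameter count, are assumptions of this sketch rather than part of the framework's implementation.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical model: declarations and references are matched by name only.
struct Decl { std::string name; int arity; };
struct Ref { std::string name; int args; };

using OverloadSet = std::vector<Decl>;
using Scope = std::function<OverloadSet(const Ref&)>;

// Build a scope from a flat list of declarations (illustrative helper).
Scope make_scope(std::vector<Decl> decls) {
  return [decls](const Ref& r) -> OverloadSet {
    OverloadSet found;
    for (const Decl& d : decls)
      if (d.name == r.name) found.push_back(d);
    return found;
  };
}

// A scope in which lookup never finds anything.
Scope empty_scope() {
  return [](const Ref&) -> OverloadSet { return OverloadSet(); };
}

// Ternary combinator: if lookup succeeds in s0, look up in s1, else in s2.
Scope ternary(Scope s0, Scope s1, Scope s2) {
  return [s0, s1, s2](const Ref& r) -> OverloadSet {
    return s0(r).empty() ? s2(r) : s1(r);
  };
}

// Hiding: the condition scope is also the scope used on success.
Scope hiding(Scope s1, Scope s2) { return ternary(s1, s1, s2); }

// Opening: the second scope is used on success, the empty scope otherwise.
Scope opening(Scope s1, Scope s2) { return ternary(s1, s2, empty_scope()); }
```

Written this way, hiding and opening differ only in which scope is consulted once lookup in the first scope has succeeded or failed.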

3.2 Weak Hiding, a New Scoping Rule 

With the current rules, only name lookup, and never name
resolution, participates in the decision of which scope to process.
As a result, as illustrated in Fig. 1b, programs are considered
invalid which, at least for practical purposes, should
 1 concept SweetAddable<Semiregular T> =
 2   requires (const T& a, T&& b) {
 3     T&& = a + b;
 4     T&& = b + a;
 5   };
 6
 7 template<typename T>
 8   requires ... std::SweetAddable<T> ...
 9 void test(T&& a, T&& b) {
10   T s1;
11   T s2;
12   T s3 = s1 + T() + T() + s2; // Not OK?
13 }

Figure 3: Ambiguity Resolution in ConceptC++

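not be. To resolve this issue, we introduce weak hiding, to allow
name resolution to participate in the decision as well.
This said, we define it as a scope combinator

$$\begin{aligned}
\unlhd &: \mathit{Scope}\ \mathit{Ref}\ \mathit{Decl} \times \mathit{Scope}\ \mathit{Ref}\ \mathit{Decl} \to \mathit{Scope}\ \mathit{Ref}\ \mathit{Decl} \\
S_1 \unlhd S_2 &= (\mathit{resolve}_{\mathit{ref}} \circ \mathit{lookup}_{\mathit{ref}})\ S_1 \mathrel{?} \mathit{lookup}_{\mathit{ref}}\ S_1 : \mathit{lookup}_{\mathit{ref}}\ S_2
\end{aligned}$$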

with the same precedence and associativity properties as 
established on the earlier combinators. 

One can view weak hiding as a middle ground between hiding
and merging where a scope is processed conditionally on
whether name resolution over the result of name lookup is
successful. For example, consider the second call (Line 13,
Eq. 5) in Fig. 2. If the relationship between test and :: were
weak hiding instead of hiding, i.e., with the scoping rules expressed
as

$$(\mathit{test} \Leftrightarrow \mathit{ns}) \unlhd {::},$$

then lookup would consider the outer scope after resolve
fails to find the declaration in Line 8 viable for the call.
Then, as with the first call (Line 4, Eq. 4), it would find the
declaration in Line 1 viable; and the call would succeed.
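
The difference between hiding and weak hiding on this call can be sketched in C++ as follows. This is a minimal model under stated assumptions, not the ConceptClang implementation: a declaration is viable when its parameter count equals the number of call arguments, resolution succeeds only with exactly one viable declaration, and make_scope and bind_name are hypothetical helper names.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

struct Decl { std::string name; int arity; };
struct Ref { std::string name; int args; };

using OverloadSet = std::vector<Decl>;
using Scope = std::function<OverloadSet(const Ref&)>;

// Build a scope from a flat list of declarations (illustrative helper).
Scope make_scope(std::vector<Decl> decls) {
  return [decls](const Ref& r) -> OverloadSet {
    OverloadSet found;
    for (const Decl& d : decls)
      if (d.name == r.name) found.push_back(d);
    return found;
  };
}

// Name resolution (assumed rule): succeed only if exactly one declaration
// has a parameter count equal to the number of call arguments.
bool resolve(const Ref& r, const OverloadSet& found, Decl& out) {
  OverloadSet viable;
  for (const Decl& d : found)
    if (d.arity == r.args) viable.push_back(d);
  if (viable.size() != 1) return false;
  out = viable.front();
  return true;
}

Scope merging(Scope s1, Scope s2) {
  return [s1, s2](const Ref& r) -> OverloadSet {
    OverloadSet found = s1(r);
    OverloadSet more = s2(r);
    found.insert(found.end(), more.begin(), more.end());
    return found;
  };
}

// Hiding: only the outcome of lookup in s1 decides whether s2 is consulted.
Scope hiding(Scope s1, Scope s2) {
  return [s1, s2](const Ref& r) -> OverloadSet {
    OverloadSet found = s1(r);
    return found.empty() ? s2(r) : found;
  };
}

// Weak hiding: s2 is consulted unless resolution over s1's matches succeeds.
Scope weak_hiding(Scope s1, Scope s2) {
  return [s1, s2](const Ref& r) -> OverloadSet {
    OverloadSet found = s1(r);
    Decl ignored{};
    return resolve(r, found, ignored) ? found : s2(r);
  };
}

// Name binding as name resolution applied to the result of name lookup.
bool bind_name(const Ref& r, const Scope& scope, Decl& out) {
  return resolve(r, scope(r), out);
}
```

In this model, binding the call foo() fails under hiding(merging(block, ns), outer) because the one-parameter declaration from ns hides the outer one, and succeeds under weak_hiding(merging(block, ns), outer) because the failed resolution lets lookup continue in the outer scope.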

Similarly to the hiding scoping rule, we could general- 
ize the definition of weak hiding into a ternary combinator; 
thereby allowing one to extend the reasoning for opening into 
a "weak opening" scoping rule. But, for the same practical pur- 
poses, let us not venture that way just yet. 

Parameterized weak hiding: The above definition of weak 
hiding uses the inherent properties of name resolution in a 
given language to determine which scope to process. De- 
pending on the intended use of weak hiding, this may lead 
to undesirable program behavior. Consider, for instance, resolving
ambiguities in an overload set. In general, if an overload
set contains more than one best viable declaration, then
this is considered an ambiguity and name resolution fails,
which results in an error. For example, take the program in
Fig. 3, which illustrates an issue that has been instrumental in
making a number of design decisions for C++ concepts [18].
This example uses C++11's lvalue and rvalue references, investigating
how move semantics affects concepts and vice versa.

For the function call in Line 12, there are at least two possible
matches in the constraints (Line 8). The function call itself
has two rvalue arguments, but the matching declarations expect
one lvalue argument and one rvalue argument, each in
different orders. With the current meaning of name resolution,
after lookup finds the matches in the constraints, resolve
finds them all to be best and equally viable for the function
call; and thus fails. Using the scope expression in Eq. 3 (with
rotate replaced with test), when name resolution fails given
the matches in the constraints, weak hiding prompts lookup
to process the outer scope; at which point at least two undesirable
things may happen. On the one hand, the function
call may eventually fail from failure to find a single best viable
declaration. On the other hand, it may actually succeed,
but by binding to the wrong declaration.

In practice, one may not want name resolution to fail,
given the matches in the constraints, if there is a guarantee
that the instantiation of the function template test will be
successful. Instead, one may want to take advantage of the
full expressive power of concepts without liability, and allow
weak hiding to change the meaning of ambiguity if the
scope being processed is restricted. In particular, for the program
in Fig. 3, it may be reasonable to define weak hiding in
such a way that whenever the scope being processed is restricted,
ambiguities in overload sets are not considered to be
errors; but rather result in some representative declaration or
"seed". With such a definition of weak hiding, the call in Line 12
succeeds and the program in Fig. 3 is considered valid.

In more general terms, we would not want to sacrifice
expressiveness for safety if we can avoid it; and to this end,
we generalize the earlier definition of the weak hiding scope
combinator as follows

$$\begin{aligned}
\unlhd_U &: \mathit{Scope}\ \mathit{Ref}\ \mathit{Decl} \times \mathit{Scope}\ \mathit{Ref}\ \mathit{Decl} \to \mathit{Scope}\ \mathit{Ref}\ \mathit{Decl} \\
S_1 \unlhd_U S_2 &= (\mathit{resolve}_{\mathit{ref}} \circ [\mathit{update} \circ \mathit{lookup}_{\mathit{ref}}])\ S_1 \mathrel{?} [\mathit{update} \circ \mathit{lookup}_{\mathit{ref}}]\ S_1 : \mathit{lookup}_{\mathit{ref}}\ S_2
\end{aligned}$$

where the subscript U indicates temporary changes in the
properties of name binding, or bind environment, such as
changes in the meaning of ambiguity, encapsulated in the
function update. Note that update is the identity if there is no
change in the bind environment.

U is a property of weak hiding that essentially acts as an 
additional parameter in this generalized definition of weak 
hiding. Therefore, we refer to the definition as parameterized 
by the bind environment. Indeed, parameterized weak hiding
allows properties of name resolution, including the meaning 
of ambiguity, to change based on the scope. 
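
A possible reading of parameterized weak hiding is sketched below in C++. Everything here is an assumption made for illustration: the bind environment is reduced to a single flag stored in the overload set, update is passed as an ordinary function, and the "seed" chosen for an ambiguous set is simply its first viable declaration. The names keep_environment, tolerate_ambiguity, make_scope and bind_name are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical model: `origin` records which scope declared the entity.
struct Decl { std::string name; int arity; std::string origin; };
struct Ref { std::string name; int args; };

// The overload set carries the part of the bind environment used here:
// whether more than one best viable declaration counts as an error.
struct OverloadSet {
  std::vector<Decl> decls;
  bool ambiguity_is_error;
};

using Scope = std::function<OverloadSet(const Ref&)>;
using Update = std::function<OverloadSet(OverloadSet)>;

Scope make_scope(std::vector<Decl> decls) {
  return [decls](const Ref& r) -> OverloadSet {
    OverloadSet found{{}, true};  // ambiguity is an error by default
    for (const Decl& d : decls)
      if (d.name == r.name) found.decls.push_back(d);
    return found;
  };
}

// Resolution: an ambiguous viable set either fails or, when ambiguity is
// tolerated, yields its first element as a representative "seed".
bool resolve(const Ref& r, const OverloadSet& set, Decl& out) {
  std::vector<Decl> viable;
  for (const Decl& d : set.decls)
    if (d.arity == r.args) viable.push_back(d);
  if (viable.empty()) return false;
  if (viable.size() > 1 && set.ambiguity_is_error) return false;
  out = viable.front();
  return true;
}

// Parameterized weak hiding: update adjusts the environment for s1 only.
Scope weak_hiding(Scope s1, Scope s2, Update update) {
  return [s1, s2, update](const Ref& r) -> OverloadSet {
    OverloadSet found = update(s1(r));
    Decl ignored{};
    return resolve(r, found, ignored) ? found : s2(r);
  };
}

// update is the identity when the bind environment does not change.
OverloadSet keep_environment(OverloadSet s) { return s; }

// In a restricted scope, ambiguity is tolerated instead of being an error.
OverloadSet tolerate_ambiguity(OverloadSet s) {
  s.ambiguity_is_error = false;
  return s;
}

bool bind_name(const Ref& r, const Scope& scope, Decl& out) {
  return resolve(r, scope(r), out);
}
```

With two equally viable operator+ declarations in the constraints scope and one in the outer scope, keep_environment lets the call bind to the outer declaration (the undesirable outcome described above), while tolerate_ambiguity keeps the binding within the constraints.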

3.3 The Language Concept 



2 The original issue was expressed using the pre-Frankfurt
[5, 20] design syntax. This example re-expresses the issue using the
Palo Alto [47] design syntax.



The combinators in Sects. 3.1 and 3.2 are not always sufficient
to express differences in the meaning of name binding. Another
set of differences comes from observing the process of name
binding itself, given a (combination of) scope(s). Through experimentation,
we have observed a recurring pattern which
allows us to define lookup and resolve generically, as follows

lookup :: Language r d => r -> Set d -> OvSet d
lookup ref decls = Set.filter (match ref) decls

resolve :: Language r d => r -> OvSet d -> Maybe d
resolve ref decls = assess $
                    select_best_viable ref decls

assess :: Ambiguity d => BVSet d -> Maybe d
assess decls = case (Set.elems decls) of
  []     -> Nothing
  [decl] -> Just decl
  _      -> ambiguity decls

where the Haskell functions lookup and resolve respectively
implement the lookup and resolve operations of Sect. 3, and
scopes are represented as sets of declarations (Set).
In this recurring pattern, name lookup, given an encoding
of how to test whether a given reference matches a given 
declaration, e.g., match, simply filters matching declarations 
out of a given scope. Name resolution, on the other hand, 
consists of a combination of: 

• selecting viable candidates out of an overload set, e.g., 
with select_best_viable, and 

• assessing the viable set based on the meaning of ambigu- 
ity, e.g., with assess. 

select_best_viable reduces an overload set (OvSet) 
down to only those declarations that are viable for a given 
reference (BVSet); and assess reduces the viable declara- 
tions down to either a single best viable declaration or an 
error (Maybe). Usually, a single viable declaration or none
automatically means success or failure of name binding,
respectively, while the meaning of more than one viable
declaration depends on that of ambiguity (ambiguity).

Ambiguity does not have a concrete meaning at such a
generic level, since it varies based on the language, features,
and perhaps even the scoping rule in use (cf. Sect. 3.2). So,
we abstract it into a concept named Ambiguity, that is used
to define assess.

    class Ambiguity d where
        ambiguity :: BVSet d -> Maybe d
        ambiguity _ = Nothing

In general, ambiguity is considered to be an error. So, we 
define it as such by default. (Instances of Ambiguity may 
override the default behavior.) 

Both select_best_viable and match also do not have concrete
meanings at this generic level. select_best_viable usually
indicates overloading in the sense of ad-hoc polymorphism
(not based on type-class constraints) or variants. For example,
in Haskell, which does not support ad-hoc polymorphism, it
may correspond to either identity or type inference depending
on how one defines name binding; and each way implies a
significant property for the language. Meanwhile, match
usually indicates various ways of matching references to
declarations by name, including any necessary normalization.
For example, in C++, function calls and uses of types both
match references to declarations the same way, but the
normalization for uses of types (i.e. after looking up names of
types) may perform ambiguity resolution, which that for
function calls (i.e. after looking up names of functions) does
not. match also highlights a filtering stage that TDNR adds to
Haskell's name lookup without changing its meaning for
select_best_viable. Sect. 4.3 discusses different
interpretations and related implications such as those
highlighted here.

Since both select_best_viable and match act on 
pairs of references and declarations as defined by the lan- 
guage, we encapsulate them in a concept named Language. 

    class Ambiguity d => Language r d where
        match :: r -> d -> Bool
        select_best_viable :: r -> OvSet d -> BVSet d

All three functions match, select_best_viable and 
ambiguity encapsulate salient properties of elementary 
name binding as defined by a language; and we reflect that 
in the Language concept, which incidentally refines the 
Ambiguity concept in order to include the ambiguity prop- 
erty into the mix. Aside from these three functions, all other 
aspects of elementary name binding have meaning at the 
generic level and constitute the recurring pattern that we 
observed. 
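
The pieces introduced so far can be assembled into a small program. The following is a minimal sketch rather than the implementation that accompanies the paper. It assumes that OvSet and BVSet are plain sets of declarations, renames lookup to lookupName to avoid a clash with the Prelude, and instantiates Language for a hypothetical toy language (Ref, Decl, scope, and bind are illustrative names) in which references match declarations by name and viability is decided by arity.

```haskell
{-# LANGUAGE MultiParamTypeClasses #-}

import           Data.Set (Set)
import qualified Data.Set as Set

-- Assumption: overload sets and best-viable sets are plain sets.
type OvSet d = Set d
type BVSet d = Set d

-- Meaning of ambiguity; an error (Nothing) by default.
class Ambiguity d where
  ambiguity :: BVSet d -> Maybe d
  ambiguity _ = Nothing

-- Language-specific properties of elementary name binding.
class Ambiguity d => Language r d where
  match              :: r -> d -> Bool
  select_best_viable :: r -> OvSet d -> BVSet d

-- Name lookup: keep the declarations that match the reference.
lookupName :: Language r d => r -> Set d -> OvSet d
lookupName ref decls = Set.filter (match ref) decls

-- Name resolution: select viable candidates, then assess them.
resolve :: Language r d => r -> OvSet d -> Maybe d
resolve ref decls = assess (select_best_viable ref decls)

assess :: Ambiguity d => BVSet d -> Maybe d
assess decls = case Set.elems decls of
  []     -> Nothing
  [decl] -> Just decl
  _      -> ambiguity decls

-- Hypothetical toy language: a reference is a name plus the number of
-- call arguments; a declaration records its namespace, name, and arity.
data Ref  = Ref String Int
data Decl = Decl String String Int deriving (Eq, Ord, Show)

instance Ambiguity Decl   -- keeps the default: ambiguity is an error

instance Language Ref Decl where
  match (Ref name _) (Decl _ name' _) = name == name'
  select_best_viable (Ref _ n) =
    Set.filter (\(Decl _ _ arity) -> arity == n)

scope :: Set Decl
scope = Set.fromList
  [ Decl "ns1" "foo" 1, Decl "ns2" "foo" 1
  , Decl "ns1" "foo" 2, Decl "ns1" "bar" 1 ]

-- Bind a reference in the toy scope: lookup followed by resolution.
bind :: Ref -> Maybe Decl
bind ref = resolve ref (lookupName ref scope)
```

Loaded in GHCi, bind (Ref "foo" 2) yields Just (Decl "ns1" "foo" 2), whereas bind (Ref "foo" 1) yields Nothing: two equally viable declarations remain, and the default ambiguity treats that as an error.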



The Language Concept, Extended: In the above definition
of elementary name binding, the meaning of name resolution
is constant throughout all scopes, and is thus only useful for
applications of non-parameterized weak hiding (Sect. 3.2). To
take advantage of parameterized weak hiding, name
resolution must take into account potential changes in the bind
environment. For example, if changes in ambiguity were the
only changes that we were interested in expressing (as is the
case for the problem in Fig. 3), then one may redefine generic
resolve as

    resolve :: Basic.Language r d =>
        BindEnv d -> r -> OvSet d -> Maybe d
    resolve env ref decls = assess env $
        select_best_viable ref decls

    assess :: BindEnv d -> BVSet d -> Maybe d
    assess env decls = case (Set.elems decls) of
        []     -> Nothing
        [decl] -> Just decl
        _      -> ambiguity env decls

where Basic.Language represents the above definition of
the Language concept (for non-parameterized weak hiding);
and BindEnv represents the bind environment and is
defined as follows.

    data BindEnv d = BindEnv
        { ambiguity :: BVSet d -> Maybe d }

In this updated definition, name resolution (more specifically,
the assessment of viable declarations) now gets the meaning
of ambiguity from the bind environment, instead of from the
language as previously. Indeed, the new definition of assess
is no longer constrained by the Ambiguity concept, but rather
now takes in an additional input, env, which represents the
bind environment.

Given this update, we can specify whether and how the 
bind environment changes, by encapsulating the changing 
behavior in a concept named Parameterized that we then 
extend our Language concept with. Thus, we define an ex- 
tended form of the Language concept as 

    class (Parameterized d, Basic.Language r d) =>
        Language r d where

which refines the Parameterized concept, defined as 

    class Parameterized d where
        get_env :: Maybe (BindEnv d)
        get_env = Nothing

where get_env represents the changing behavior. By de- 
fault, we assume no change in the bind environment, and 
reflect that in the definition of Parameterized. When the 
meaning of ambiguity changes, say to treat ambiguities as 
non-errors (amb_is_not_err), one may express that by in- 
stantiating the Parameterized concept as follows. 

    instance Parameterized (WeakHiding d) where
        get_env = Just $ BindEnv amb_is_not_err
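
To make the role of the bind environment concrete, here is a minimal sketch under stated assumptions; it is not the paper's implementation. BVSet is assumed to be a plain set, ambIsNotErr is a hypothetical non-error policy that simply keeps the least declaration, WeakHiding is a hypothetical wrapper type, and update is an assumed reading of the function of that name from Sect. 3.2: identity unless get_env supplies a new environment.

```haskell
import           Data.Maybe (fromMaybe)
import           Data.Set   (Set)
import qualified Data.Set   as Set

-- Assumption: best-viable sets are plain sets.
type BVSet d = Set d

-- The bind environment carries the current meaning of ambiguity.
newtype BindEnv d = BindEnv { ambiguity :: BVSet d -> Maybe d }

-- Ambiguity as an error (the usual default).
ambIsErr :: BVSet d -> Maybe d
ambIsErr _ = Nothing

-- Hypothetical non-error policy: keep the least declaration.
ambIsNotErr :: BVSet d -> Maybe d
ambIsNotErr decls = case Set.elems decls of
  (decl : _) -> Just decl
  []         -> Nothing

-- Assessment now consults the environment instead of a type class.
assess :: BindEnv d -> BVSet d -> Maybe d
assess env decls = case Set.elems decls of
  []     -> Nothing
  [decl] -> Just decl
  _      -> ambiguity env decls

class Parameterized d where
  get_env :: Maybe (BindEnv d)
  get_env = Nothing

-- Hypothetical wrapper marking declarations bound under weak hiding.
newtype WeakHiding d = WeakHiding d deriving (Eq, Ord, Show)

instance Parameterized (WeakHiding d) where
  get_env = Just (BindEnv ambIsNotErr)

instance Parameterized Int   -- default: no change in the environment

-- Assumed shape of update: identity unless get_env gives a new environment.
update :: Parameterized d => BindEnv d -> BindEnv d
update env = fromMaybe env get_env
```

With these definitions, assessing a two-element set of Int under update (BindEnv ambIsErr) still fails, since the Int instance leaves the environment unchanged, while the same assessment over WeakHiding values succeeds with the least element.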



4. Applications 

In Sect. 2, we described issues that we have encountered that
led to the development of this framework. We also
highlighted some positive progress that we have been able to
make using this framework. For starters, besides uncovering
misinterpretations of language specifications in current
compilers, we can now clarify complex features like ADL and
uses of operators in C++, as well as differences between C++,
Haskell, and proposed features like multimethods for C++
and TDNR for Haskell. This section is intended to precisely
cover such progress, with Sects. 4.1 and 4.2 showing
applications of our scope combinators, Sect. 4.3 showing
applications of the Language concept, and Sect. 4.4
highlighting applications to compilers covered outside of this
paper.



4.1 Understanding Argument-Dependent Lookup 



ADL (as specified in [51], clause 3.4.2) makes additional
declarations available for name binding by introducing
declarations from scopes where the types of the arguments are
defined. Fig. 4 shows an example of ADL, similar to the
earlier example in Fig. 2 (Line 25).

    1  namespace ns1 {
    2    struct X {};
    3    void foo(X);
    4  }
    5  namespace ns2 {
    6    void foo(int);
    7  }
    8  void bar(ns1::X x) {
    9    using ns2::foo;
    10   foo(x);
    11 }
    12 void baz(ns1::X x) {
    13   void foo(int);
    14   foo(x);
    15 }

Figure 4: ADL in C++

Even in a program as small as this example, we can already
notice some inconsistencies in the ways that compilers
interpret the specification of ADL. Upon experiments with
different C++ compilers, namely Clang 4.1 [8], GCC 4.8 [13],
the Intel compiler 13.0.1 [22], Microsoft Visual Studio [31],
and the Comeau compiler [9], we find that when it comes
to ADL, the compilers behave inconsistently, with the Intel
compiler [22] differing from the other compilers. The example
shows two functions, bar and baz, both calling a function foo
on an argument of type ns1::X. As per the example in Fig. 2,
ADL should be enabled in bar, leading to a successful call,
and disabled in baz, leading to an error stating that the call
on Line 14 does not match the declaration on Line 13.
GCC 4.8, Clang 4.1, and Comeau do indeed issue such an
error, while the Intel compiler does not.



    void zet(ns1::X x) {
      void foo(int);
      foo(x);
      {
        using ns2::foo;
        foo(x);
      }
    }

Figure 5: ADL in C++, extended

Fig. 5 illustrates further complications with ADL. Assuming
the namespaces from Fig. 4, the first call to foo fails because
ADL is blocked by the declaration of foo on the line above.
The call in the nested scope succeeds, however. According to
the ADL rules, if a name was found through a using
declaration, then ADL proceeds. The first declaration of foo is
hidden by the using declaration, so ADL succeeds in resolving
the second call to foo. We can express this set of rules, more
generally, using the following scope expression³

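\[
f_{ADL}(H) \triangleleft \mathop{\triangleleft}_{i=1}^{s} f_{ADL}(S_i)
\triangleleft \mathop{\triangleleft}_{i=1}^{n-1} fn_{ADL}(N_i) \triangleleft
\bigl( N_n \Leftrightarrow U_{N_n} \Leftrightarrow
((\overline{N_n} \Leftrightarrow \overline{U_{N_n}}) \triangleleft ADL) \bigr)
\]

when the name binding is triggered from a block scope, or

\[
fn_{ADL}(H) \triangleleft \mathop{\triangleleft}_{i=1}^{n-1} fn_{ADL}(N_i)
\triangleleft \bigl( N_n \Leftrightarrow U_{N_n} \Leftrightarrow
((\overline{N_n} \Leftrightarrow \overline{U_{N_n}}) \triangleleft ADL) \bigr)
\]

when it is triggered from a namespace scope, where⁴

$f_{ADL}(X) = X \Leftrightarrow U_X \Leftrightarrow (U_X \triangleright ((X \Leftrightarrow \overline{U_X}) \triangleleft ADL))$,
$S_1 \cdots S_s$ = surrounding non-namespace scopes,
$fn_{ADL}(N) = N \Leftrightarrow U_N \Leftrightarrow ((N \Leftrightarrow U_N) \triangleright ((\overline{N} \Leftrightarrow \overline{U_N}) \triangleleft ADL))$,
$N_1 \cdots N_n$ = surrounding namespaces,
$H$ = scope where name binding is triggered from,
$\overline{X}$ = non-function (template) declarations in scope $X$,
$U_X$ = using declarations in scope $X$, and
$ADL$ = associated namespaces of reference's arguments.

3 $\mathop{\triangleleft}_{i=1}^{n}$ represents an iteration of $\triangleleft$ over values of i ranging from 1
to n (inclusive).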

This generalized expression covers most of the standard,
excluding qualified name lookup. The functions $f_{ADL}$ and
$fn_{ADL}$ represent a further expansion of the scoping rules,
which states that a declaration matching a function call is
progressively searched for through block scopes, and then,
equally, through all the surrounding namespaces, their using
declarations, and associated namespaces (ADL). The distinction
between $f_{ADL}$ and $fn_{ADL}$ indicates that function
declarations in block scopes may hide associated namespaces
but those in the surrounding namespaces cannot. Also,
namespace scopes may enable ADL but block scopes with no
using declaration cannot. Finally, ADL can occur at block
scopes even if one of the scopes disables ADL.

The complexity of ADL is a prime example that shows 
how our framework can be used to capture complex designs, 
that normally require multiple paragraphs of specification 
in the standard, in a concise form that aids in analysis and 
understanding. 



4.2 Understanding C++ Operators 

The use of C++ operators is another interesting portion of our
experiments. At first, it appears to be a rather simple feature.
However, not only does it quickly gain complexity with the
addition of ADL, but its specification appears to be widely
misinterpreted by production quality compilers. Take, for
instance, the program in Fig. 6.

    1  struct X {}; struct Y {};
    2
    3  void operator+(X, X) {}
    4  void operator+(X, Y) {}
    5
    6  void test(X x, Y y) {
    7    void operator+(X, X);
    8    x + x;
    9    x + y;
    10   operator+(x, x);
    11   operator+(x, y);
    12 }

Figure 6: Using C++ operators

Our experiments unveil that most compilers handle it
incorrectly. Clang 4.1 [8] and GCC 4.8 [13] both signal an
error on Line 11 stating that y cannot be converted to X as
required by the declaration on Line 7. The Intel compiler
13.0.1 [22] accepts the code. Microsoft Visual Studio [31]
compiles the code, but its visual editor indicates an error
(this is due to the editor using a different frontend than the
compiler). The Comeau compiler [9], the only one we tested
that handles this example correctly, rejects both the operator
expression on Line 9 and the operator function call on
Line 11.

4 Symbols such as S, N, H and M are simply annotations for
different kinds of scopes, based on the wording of the C++
standard, that do not change the process of name binding itself,
except for how they are combined.

Why do mature production-grade compilers produce such
different results for such a seemingly simple piece of code?
The specification of binding uses of operators ([51], clause
13.3.1.2) is spread out over several paragraphs and contains
references to other parts of the C++ specification. The correct
approach is to treat operator function calls such as the
ones on Lines 10-11 as normal function calls with all the
usual hiding rules. Operator expressions such as on Lines 8-9
should be converted into operator function calls, according
to a table given in the specification. Then, the functions
are to be looked up according to modified lookup rules that
merge member, non-member, and built-in candidates
together, which is not the usual approach for functions. We
have not analyzed the implementations of the different
compilers to understand how the specification has been
misunderstood or even if the errors are a known issue, but this
example shows that a framework like ours can be an
invaluable tool for clarifying how operators are looked up. Based
on the specification, we specify the general scoping rules for
operator lookup (after the operator is converted into an
appropriate function call) as



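\[
(H \Leftrightarrow O_b \Leftrightarrow M) \triangleleft
\mathop{\triangleleft}_{i=1}^{s} S_i \triangleleft
\mathop{\triangleleft}_{i=1}^{n} N_i
\]

without ADL, or

\[
\bigl( H \Leftrightarrow U_H \Leftrightarrow O_b \Leftrightarrow M \Leftrightarrow
(U_H \triangleright ((H \Leftrightarrow \overline{U_H} \Leftrightarrow O_b \Leftrightarrow M) \triangleleft ADL)) \bigr)
\triangleleft \mathop{\triangleleft}_{i=1}^{s} f_{ADL}(S_i)
\triangleleft \mathop{\triangleleft}_{i=1}^{n-1} fn_{ADL}(N_i)
\triangleleft \bigl( N_n \Leftrightarrow U_{N_n} \Leftrightarrow
((\overline{N_n} \Leftrightarrow \overline{U_{N_n}}) \triangleleft ADL) \bigr)
\]

with ADL, where

$f_{ADL}(X) = X \Leftrightarrow U_X \Leftrightarrow (U_X \triangleright ((X \Leftrightarrow \overline{U_X}) \triangleleft ADL))$,
$S_1 \cdots S_s$ = surrounding non-namespace scopes,
$fn_{ADL}(N) = N \Leftrightarrow U_N \Leftrightarrow ((N \Leftrightarrow U_N) \triangleright ((\overline{N} \Leftrightarrow \overline{U_N}) \triangleleft ADL))$,
$N_1 \cdots N_n$ = surrounding namespaces,
$H$ = scope where name binding is triggered from,
$O_b$ = built-in operators,
$M$ = member scope of operator's first argument,
$\overline{X}$ = non-function (template) declarations in scope $X$,
$U_X$ = using declarations in scope $X$, and
$ADL$ = associated namespaces of reference's arguments.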

The above expression states that the scope of the call, the 
members of the first operand, and the built-in operators are 
considered first, and, if nothing was found in one of these 
scopes, the surrounding scopes are considered next. 

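One can reduce the expression by noting that $\overline{X}$ must
be empty since uses of operators look up operator functions,
which are named with the reserved keyword operator.
After reduction, we get

\[
\bigl( H \Leftrightarrow U_H \Leftrightarrow O_b \Leftrightarrow M \Leftrightarrow
(U_H \triangleright ((H \Leftrightarrow O_b \Leftrightarrow M) \triangleleft ADL)) \bigr)
\triangleleft \mathop{\triangleleft}_{i=1}^{s} f_{ADL}(S_i)
\triangleleft \mathop{\triangleleft}_{i=1}^{n-1} fn_{ADL}(N_i)
\triangleleft (N_n \Leftrightarrow U_{N_n} \Leftrightarrow ADL)
\]

and then

\[
\bigl( H \Leftrightarrow U_H \Leftrightarrow O_b \Leftrightarrow M \Leftrightarrow
(U_H \triangleright ((H \Leftrightarrow O_b \Leftrightarrow M) \triangleleft ADL)) \bigr)
\triangleleft \mathop{\triangleleft}_{i=1}^{s} f_{ADL}(S_i)
\triangleleft \bigl( \mathop{\triangleleft}_{i=1}^{n} (N_i \Leftrightarrow U_{N_i}) \Leftrightarrow ADL \bigr)
\]

where

$f_{ADL}(X) = X \Leftrightarrow U_X \Leftrightarrow (U_X \triangleright (X \triangleleft ADL))$,
$S_1 \cdots S_s$ = surrounding non-namespace scopes,
$N_1 \cdots N_n$ = surrounding namespaces,
$H$ = scope where name binding is triggered from,
$O_b$ = built-in operators,
$M$ = member scope of operator's first argument,
$U_X$ = using declarations in scope $X$, and
$ADL$ = associated namespaces of reference's arguments.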

4.3 A Cross-Language Analysis 

Aside from combining scopes using our combinators, one 
can carry on substantial discussions about the process of 
name binding solely based on its properties of match, 
select_best_viable and ambiguity; that is, by sim- 
ply instantiating the Language concept. In this section, we 
demonstrate this with a representative series of comparative 
analyses between selected features of C++ and Haskell. The 
essential thing to notice with these analyses is that we never 
mention language-specific details, unless they are directly 
related to the salient properties. Explicit instantiations, along 
with the implementation of our framework, is available on- 
line QD . 

C++ Types vs. Functions, in Clang: In C++, as in most lan- 
guages, ambiguity is an error by default. So, function calls 
and uses of types at least share that property. However, when 
it comes to matching references to declarations or reduc- 
ing overload sets, at least judging from the implementation 
structure of Clang |8|, they follow different paths. 

For match, both kinds of uses start off checking that
declarations and references share the same name. Then, there
is a normalization process that removes all duplicates from
the set of matching declarations, and even filters out
unacceptable template declarations as necessary. The main point
of difference comes from Clang's implementation of the
normalization, in LookupResult::resolveKind(). When
more than one distinct declaration is found, C++ considers
it an ambiguity if the reference is not a function call, and an
overload otherwise. So, for non-functions, Clang instantly
marks the lookup result as ambiguous. The effect of this is
that, only for non-functions, Clang performs some name
resolution during name lookup.

For select_best_viable, functions can be overloaded,
so this simply consists of overload resolution. Types, on
the other hand, cannot be overloaded. So, this starts off
by checking for any kind of ambiguity in the lookup
result. In Clang, name lookup, through match, has already
performed this check and flagged the lookup result
appropriately. So, select_best_viable simply checks for
the flag. If the lookup result is either ambiguous or empty,
select_best_viable is identity and thus immediately
passes the result through to assess. If not, it performs a
viability check on the single matching declaration, as
necessary, and either passes the declaration on to assess if the
viability check succeeds, or nothing otherwise.

    1  struct X, Y, Z;
    2  void foo(virtual X&, virtual Y&);   // (1)
    3  void foo(virtual Y&, virtual Y&);   // (2) unique-base method
    4  void foo(virtual Y&, virtual Z&);   // (3)
    5  struct XY : X, Y { }
    6  struct YZ : Y, Z { }
    7  void foo(virtual XY&, virtual Y&);  // (4) overrider for (1) and (2)
    8  void foo(virtual Y&, virtual YZ&);  // (5) overrider for (2) and (3)
    9
    10 XY xy; YZ yz;
    11 foo(xy,yz); // both (4) and (5) are equally viable matches, with unique base (2)

Figure 7: C++ multimethods example

The essential takeaway from this analysis is that by talking
about the process of name binding in terms of the salient 
properties in the Language concept, we have revealed sub- 
tleties in the implementation of uses of types that do not ap- 
ply to uses of functions; and which may indicate just how 
composable components of their respective implementations 
are. 

When Ambiguity Is Not an Error: It is unusual for
ambiguity to not be an error. However, there exist special
circumstances under which this kind of relaxed property is just
what a language needs. Take, for instance, C++'s proposed
extension with multimethods [35]. Fig. 7 shows an example
that is taken straight from the proposal.

The proposal discusses semantic and engineering benefits
to relaxing the rules for ambiguity in such a way that
the call in Line 11, which is normally considered ambiguous,
would be considered valid. In essence, it allows ambiguous
calls when all best viable candidates, called overriders, "have
a unique base method through which the call can be
dispatched". In such cases, it argues for using the unique base
to seed the runtime dispatch table.

While the proposal covers greater details than illustrated
here, we simply use this to illustrate the wide range of
applicability of our framework, as well as to reinforce why
ambiguity is a salient property of the language. For the example
in Fig. 7, we find it reasonable to assume that match and
select_best_viable remain as defined in C++ sans
multimethods.

Type-Directed Name Resolution vs. Overloading: TDNR
[50] is a proposed Haskell extension that "exploits the power
of the dot" to provide an alternative way to disambiguate
name uses. In essence, if a function call f a matches two
separate declarations of different types, TDNR allows one to
select a declaration based on the type of the call argument
using a syntax like a.f. The similarity of this syntax to that
of object-oriented languages has raised questions about how
TDNR relates to the notion of overloading. Although there
have been several discussions over this within the Haskell
community, our framework provides an alternative way to
carry on the discussions. As a guide, we use it here to show
how TDNR is not overloading based on the language's
definition of name binding.

First and foremost, TDNR does not affect either of the
select_best_viable or ambiguity salient properties of
name resolution, whether type inference is part of the
process or not. Therefore, it cannot be overloading in the sense
of ad-hoc polymorphism. It is not overloading based on
type-class constraints because the call a.f is subject to the same
scoping rules as f a, when expressed using our scope
combinators (Sects. 3.1 and 3.2). The only difference between the
calls a.f and f a is in the match property. For any function
call, match typically just checks that the names of
declarations match that of the call. However, for the dotted call a.f,
it adds a normalization step that filters declarations that do
not match the type of a out of consideration. The type of a is
deduced before the binding of a.f is triggered, and is used
by match for the said binding. So, there is no reason for any
of the other components to be involved.
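
The difference between the two kinds of calls can be illustrated with a small sketch. It is an illustration under stated assumptions, not part of the TDNR proposal or of the framework's implementation: Decl, Ref, lookupName, and decls are hypothetical names, argument types are modelled as plain strings, and the type of a is assumed to have been deduced before binding starts.

```haskell
import           Data.Set (Set)
import qualified Data.Set as Set

-- Hypothetical declaration: a function name and its argument type.
data Decl = Decl { declName :: String, argType :: String }
  deriving (Eq, Ord, Show)

-- "f a" is a plain call; "a.f" is a dotted call that carries the
-- already-deduced type of a.
data Ref = Plain String | Dotted String String

-- Only match changes: the dotted form adds a filter on the argument type.
match :: Ref -> Decl -> Bool
match (Plain name)     decl = declName decl == name
match (Dotted name ty) decl = declName decl == name && argType decl == ty

-- Name lookup is the same generic filter in both cases.
lookupName :: Ref -> Set Decl -> Set Decl
lookupName ref = Set.filter (match ref)

decls :: Set Decl
decls = Set.fromList [Decl "f" "A", Decl "f" "B", Decl "g" "A"]
```

Here lookupName (Plain "f") decls keeps both declarations of f, while lookupName (Dotted "f" "A") decls keeps only the one whose argument type is A. Nothing about viability selection or ambiguity needs to change.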

4.4 Compiler Integration 

Our framework can facilitate automated reasoning in
compiler design. Compilers can either

• implement the scoping rules of a given language entirely
based on our scope combinators (Sects. 3.1 and 3.2), or

• simply extend their current mechanism for name binding
with only weak hiding (Sect. 3.2).

Work is currently underway that explores and implements
these alternatives for reduced versions of practical
languages like language-c, Featherweight Java, and a mini-C++
language that we designed (cf. [6]). Particularly noteworthy,
an implementation of weak hiding is based on iteratively
applying current mechanisms for name binding, noting that
binding a scope expression $S_1 \trianglelefteq S_2$ is equivalent to binding
$S_1$, followed by binding $S_2$ if unsuccessful. That is,

\[
bind(ref, (S_1 \trianglelefteq S_2)) =
bind(ref, S_1) \;?\; bind(ref, S_1) : bind(ref, S_2).
\]

This implementation is called 2-stage name binding (Bind×2)
for obvious reasons, has automated capabilities, and is
currently implemented in ConceptClang [53].
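
The two-stage rule can be sketched as a wrapper around an existing binder. This is a minimal illustration, not ConceptClang's code: ScopeExpr, bindExpr, bindAssoc, and example are hypothetical names, scope expressions are restricted to weak hiding, and an association-list lookup stands in for a compiler's existing name binding mechanism.

```haskell
import Control.Applicative ((<|>))

-- Scope expressions restricted to weak hiding, for illustration only.
data ScopeExpr s
  = Scope s
  | WeakHide (ScopeExpr s) (ScopeExpr s)

-- Given an existing binder for plain scopes, bind a scope expression:
-- try the left operand first and fall back to the right one on failure.
bindExpr :: (r -> s -> Maybe d) -> r -> ScopeExpr s -> Maybe d
bindExpr bind ref (Scope s)        = bind ref s
bindExpr bind ref (WeakHide s1 s2) =
  bindExpr bind ref s1 <|> bindExpr bind ref s2

-- Hypothetical stand-in for a compiler's existing binder.
bindAssoc :: String -> [(String, Int)] -> Maybe Int
bindAssoc = lookup

example :: ScopeExpr [(String, Int)]
example = WeakHide (Scope [("foo", 1)]) (Scope [("foo", 2), ("bar", 3)])
```

With this setup, bindExpr bindAssoc "foo" example gives Just 1 because the first scope binds successfully, and bindExpr bindAssoc "bar" example gives Just 3 because binding in the first scope fails and the second scope is tried. Since the fallback is triggered by a failed binding rather than an empty lookup, any failure in the first stage (for instance, no viable candidate) falls through to the second.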

5. Discussion and Related Work 

We have introduced a framework to help carry on analytical
discussions about name binding, and even its implementation,
without getting bogged down in unnecessary details. This
was necessary since a representative mini survey of
well-known programming language books shows that there is
not a widely accepted formalism to reason about it. Scott [41]
offers one of the most complete treatments of name binding.
He introduces bindings, scope rules, and overloading where the
results of lookup are followed by a static analyzer choosing
the best declaration. Scott also frequently uses the term
reference for name uses. Sebesta [42] provides a very similar
discussion. Friedman et al. [12] discuss "scoping and binding
variables," but do not include overloading. Abelson and
Sussman [1] introduce names that refer to variables and the
environment to keep the association between variables and
their values. Pierce [34] only discusses name binding
implicitly, as components of the specific systems discussed in the
book. Mitchell [32] also introduces the notion of scopes, but
his description is mostly about implementation aspects. In
summary, there is no standard way to specify name lookup,
or name binding, independently of language details.

Applications of our framework are not limited to lexical
scoping. In fact, for dynamic scoping, we can use our scope
combinators to express the scoping rules for binding any
given reference in a manner similar to the formulations
presented in this paper (e.g. Eqs. 1-3 and Sects. 4.1 and 4.2). The
only difference from lexical scoping is that the scopes are
collected and combined based on the calling stack rather than
the lexical visibility of names when they are parsed.

Our framework provides a unified way to specify name
binding, but can also benefit from formal extensions based
on previous and related work [21, 23, 43, 44, 56, 57, 59, 60],
as well as structural extensions based on ongoing work [6].
As structural extensions, we may consider alternative
definitions of the scope combinators or of the Language
concept, including a different compositional view of name
binding altogether. As formal extensions, it helps to note that of
all previous and related work that we have surveyed, only
one has actually defined semantics for checking name uses
in restricted scopes [58, 60]. This work builds upon the other
works on formalizing concepts and characterizes the
checking of name uses within restricted scopes as a name lookup
problem. It then formalizes the said checking as consisting of
name lookup within the constraints only. As such, in relation
to our framework, we find it limited in three ways:

1. It is based on a specific design of concepts, the
pre-Frankfurt design [5], for a specific language, C++.

2. It restricts the checking of name uses within constraints
to name lookup within the constraints only, thereby
ignoring outer scopes.

3. It does not support the weak hiding scoping rule, and
perhaps only implicitly supports the opening scoping rule.

Moving forward, we may combine ideas from this work,
[58, 60], and the formalization of C++ templates [44] to at
least formalize weak hiding or Bind×2. Alternatively, and for
the other unconventional scoping rule of opening, we may
consider extending System F, or even System F^G [43], with
all novel components of our framework.

Upon further investigation, we have noted that a large 
portion of related work comes from the area of name analysis 
or name management f7i ri0l[TTl[T7ll24H^[^[^l36ti40ll48l[55l . 
However, this area tends to take fundamentally different 
approaches than necessary for our framework. In particular,
it is centered around issues with engineering name binding
(e.g. [29, 55]), while our framework is more interested in
expressing name binding. A representative survey of related
work follows.

Attribute grammars [28] naturally support the notion
of symbol table construction through grammar attributes.
Since their introduction by Knuth, attribute grammars have
been used in many variations. For example, Rewritable
Reference Attribute Grammars (ReRAGs) [10] have been used
for construction of symbol tables for context-sensitive
languages and restructuring these symbol tables for refactoring
of names. ReRAGs have been used to define sound renaming
for Java [40]. The basic tenet of this work is the ability
to provide provably correct inversion of name lookup,
matching declarations to name uses, from the definition of
name lookup alone. Ekman and Hedin [11] demonstrate how
ReRAGs can be used to structure name analysis in a
declarative and modular way, structuring the implementation
similarly to the specification. They deal with issues similar to
those that our framework tackles, such as ambiguity and
nested scopes with name hiding. Their work, however, does
not provide a formal way to consider and specify lookup
independently of implementations; they are more concerned
with a modular implementation than with a tool to reason
about designs for scoping rules. Kastens and Waite [26] tackle
the problem of name analysis by encapsulating the basic
operations necessary to define it in an abstract data type (ADT).
Their ADT is a lookup table with scoping rules, providing
operations to create scopes, insert and look up names, and so
on. The ADT can then be used in an attribute grammar where
contextual analysis is specified. Our framework could also be
used in a similar way, but we separate Kastens and Waite's
ADT into scopes, scope ordering relations, and an ambiguity
design. Furthermore, our framework provides a complete
solution for describing all features of name analysis, including
overloading, while their framework is only concerned with
name lookup and is supposed to be used with a separate
overloading module. Kastens and Waite's ADT is used in the Eli
compiler construction system [17], which provides a library
of modules for interpreting the lookup table created in an
attribute grammar (e.g., multiple inheritance module).

Reiss [38] provides a complete system for "symbol
processing" in the ACORN automatic compiler production
project [37] based on an attribute grammar approach. Reiss
provides a formal name binding model with some components
similar to ours, such as LOOK_UP and RESOLVE functions.
Symbol tables for particular languages are specified using
an abstract description language. A symbol table can then
be processed either through a low-level interface or through
a domain-specific language in an attribute grammar
framework. Reiss's system is similar to ours in that it recognizes a
formal model for name lookup, which shares many common
elements with our framework. In contrast to our approach,
Reiss's system is meant for implementing a language rather
than abstracting from it. Because of that, even the abstract
description of a symbol lookup module resembles the
implementation of a compiler. Our framework is better suited for
design and analysis, yet it still provides a useful
implementation framework.

Name management [24, 25] has been introduced as one
of the important computer science issues for the 1980s [39]
and was defined to be the means for establishing names for
objects, accessing the objects using their names, and
controlling the availability and the meaning of names in different
scopes. Name management shares some similar basic
terminology with our approach, but its goals are quite different.
In particular, name management is meant to alleviate the
issues of interoperability and performance, and, as such, is
more concerned with modeling of implementation of names,
objects, and access, introducing the notion of time of scope
duration, for instance. Name management could be used as a
general "backend" for our framework instead of a particular
programming language, providing a layer of insulation
between our abstractions and particular languages.

Power and Malloy [36] describe name lookup in C++ as
a collection of UML [7] artifacts. Their description
concentrates on name lookup explicitly, and their goal is to simplify
implementation of future C++ tools by describing one of the
hard implementation issues at the high level of a UML
specification. While they isolate name lookup like we do, their
primary concern is specification of implementations of name
lookup for C++ rather than an abstract framework for making
design decisions.

Sulzmann [48 [ introduced a general framework for Hindley- 
Milner type systems with constraints. His work allows con- 
sideration of various language extensions as instantiations of 
the general framework, isolating the concern of a constraint 
system from the rest of the language. The framework is based 
on an idealized formal Hindley-Milner system parameter- 
ized by a constraint system. Our work is parameterized in 
the opposite direction: we provide a framework that is built 
on an abstraction of a language contained in an opaque in- 
terface. Our framework is applied by taking a language and 
providing appropriate instances in our framework, includ- 
ing a "compiler" component that generates scope descrip- 
tors for references. Then, the language uses our framework 
for name binding.

Konat et al. [29] pursue similar objectives to our work, i.e.,
describing name binding and scoping rules in a language- 
parametric fashion. However, their work quickly dives into 
implementation details that are not necessary for our pur- 
poses. In particular, in addition to abstracting references, dec- 
larations and scopes, they abstract namespaces and imports. 
In our framework, namespaces are simply scopes that do not 
need special consideration, except for clarifying scope ex- 
pressions with respect to the wording of the standard doc- 
ument; and imports are expressed via the scope combina- 
tors - usually merging - based on the language. Moreover, 
using their framework requires defining binding rules in 
the Stratego rewriting language [52] or for the Spoofax language
workbench [27], thereby specifying details in the kinds
of scopes (e.g. namespaces, classes, etc.), declarations (e.g. 
classes, methods, variables, imports, unique, etc.), or ref- 
erences (e.g. uses of classes, types, variables, fields, etc.). 
In contrast, using our framework, one can talk about differ- 
ent ways to perform name binding, especially in restricted 
scopes, without defining such binding rules. There are two
other points of difference. First, their framework
is limited to lexical scoping whereas ours is not. Second,
they abstract from the process of name binding,
defining it as dependent on some language-specific "name
resolution strategy", whereas we segment the process further into
two distinct components, lookup and resolution, which allows
us to define the weak hiding scoping rule distinctly
(independently of the language). Both frameworks may share
similar goals, but the fundamental motivations and struc- 
tures are different yet likely complementary. In fact, the level 
of detail that Konat et al.'s framework addresses may be nec- 
essary to instantiate our Language concept. 

6. Conclusion 

We have designed a framework for understanding and spec- 
ifying name binding, independently of language-specific de- 
tails. The framework abstracts the fundamental notions of 
references, declarations and scopes, and consists of two layers 
of abstractions: the first layer allows one to express scoping rules
using four combinators that we have defined, named hiding,
merging, opening and weak hiding; and the second layer captures
the other properties of name binding, those not expressed
by the combinators, which we abstracted in a concept named
Language. Using this framework, one can carry out analysis
and experimentation on name binding across different
languages, features, and even kinds of references in a uni- 
fied and structured fashion. For instance, we uncovered mis- 
interpretations of the specifications for ADL and uses of op- 
erators in C++, concisely expressed what specifications actu- 
ally say for various features like ADL (in C++) and TDNR (in 
Haskell), and performed structured cross-language analysis. 

One particular scoping rule that we defined, weak hiding, 
is new to programming languages and useful for ensuring the
backward compatibility of C++ programs as C++ transitions 
into ConceptC++. Supporting weak hiding in compilers entails 
a new mechanism for name binding, Bind* 2 , which simply 
consists of an iterative process over current mechanisms for 
name binding. Bind* 2 is currently implemented in Concept- 
Clang, as weak hiding is considered essential for implement- 
ing concepts for C++. 
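
As an illustration only, the contrast between ordinary hiding and weak
hiding can be sketched in a few lines of C++17. The sketch rests on
simplifying assumptions that are not taken from the framework itself: a
scope is a flat list of declarations, lookup matches by name, and
resolution accepts the first candidate whose arity equals the number of
arguments. The names Decl, Scope, lookup, resolve, bind_hiding and
bind_weak_hiding are hypothetical and do not correspond to the actual
ConceptClang implementation.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical model of a declaration: a name, an arity standing in for
// whatever a real language uses to judge viability, and a scope label.
struct Decl {
    std::string name;
    int arity;
    std::string origin;
};

// Assumption: a scope is just a flat list of declarations.
using Scope = std::vector<Decl>;

// Lookup: collect every declaration in `scope` with a matching name.
std::vector<Decl> lookup(const Scope& scope, const std::string& name) {
    std::vector<Decl> found;
    for (const Decl& d : scope)
        if (d.name == name) found.push_back(d);
    return found;
}

// Resolution: pick the first candidate whose arity matches, if any.
std::optional<Decl> resolve(const std::vector<Decl>& candidates, int arity) {
    for (const Decl& d : candidates)
        if (d.arity == arity) return d;
    return std::nullopt;
}

// Ordinary hiding: once the inner scope declares the name, the outer scope
// is never consulted, even if resolution then fails.
std::optional<Decl> bind_hiding(const Scope& inner, const Scope& outer,
                                const std::string& name, int arity) {
    std::vector<Decl> candidates = lookup(inner, name);
    if (candidates.empty()) candidates = lookup(outer, name);
    return resolve(candidates, arity);
}

// Weak hiding, as sketched here: attempt a complete bind (lookup followed
// by resolution) in the inner scope, and retry in the outer scope only if
// that bind fails.
std::optional<Decl> bind_weak_hiding(const Scope& inner, const Scope& outer,
                                     const std::string& name, int arity) {
    if (std::optional<Decl> d = resolve(lookup(inner, name), arity)) return d;
    return resolve(lookup(outer, name), arity);
}
```

With an inner scope declaring a one-argument f and an outer scope
declaring a two-argument f, a reference to f with two arguments fails to
bind under bind_hiding but binds to the outer declaration under
bind_weak_hiding. The second function is simply the existing bind step
applied twice, which is the sense in which this sketch mirrors an
iterative process over current mechanisms.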

As evidenced in this paper, in an ideal world, specifi- 
cations could be formulated based on our framework, and 
compilers could unambiguously implement name binding 
using such formulation. 